Stop using LLVM struct types for alloca, byval, sret, and many GEPs #121577

erikdesjardins · 2024-02-25T06:46:21Z

This is an extension of #98615, extending the removal from field offsets to most places that it's feasible right now. (It might make sense to split this PR up, but I want to test perf with everything.)

For alloca, byval, and sret, the type has no semantic meaning; only the size matters*†. Using [N x i8] is a more direct way to specify that we want N bytes, and avoids relying on LLVM's layout algorithm. Particularly for alloca, it is likely that a future LLVM will change to a representation where you only specify the size.

For GEPs, upstream LLVM is in the beginning stages of migrating to ptradd. LLVM 19 will canonicalize all constant-offset GEPs to i8, which is the same thing we do here.
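To make the "constant-offset GEP on i8" idea concrete, here is a minimal sketch in plain Rust rather than rustc internals. The `Example` struct and both helper functions are hypothetical, and `repr(C)` is assumed only so that the field offsets are predictable. It shows that a field access is nothing more than the base pointer plus a constant number of bytes.

```rust
use std::ptr::addr_of;

/// Hypothetical example struct; `repr(C)` is assumed here only to fix the field order.
#[repr(C)]
struct Example {
    a: u8,
    b: u32,
    c: u16,
}

/// Returns the byte offsets of `a`, `b`, and `c` from the start of `Example`.
fn field_offsets() -> [usize; 3] {
    let value = Example { a: 1, b: 2, c: 3 };
    let base = addr_of!(value) as usize;
    [
        addr_of!(value.a) as usize - base,
        addr_of!(value.b) as usize - base,
        addr_of!(value.c) as usize - base,
    ]
}

/// Reads `b` the way a byte-offset GEP would: base pointer plus a constant byte count.
fn read_b_via_byte_offset(value: &Example) -> u32 {
    let offset = field_offsets()[1];
    let base = value as *const Example as *const u8;
    // SAFETY: `offset` is the offset of `b` inside `Example`, so the result
    // stays inside `value` and is suitably aligned for `u32`.
    unsafe { base.add(offset).cast::<u32>().read() }
}
```

No struct type is needed to find `b`; the offset alone is enough, which is the property the `i8` GEP form relies on.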

*: Since we always explicitly specify the alignment. For byval, this wasn't the case until #112157.

†: For byval, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue here.
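As a rough illustration of the claim that only size and (explicit) alignment matter for `byval` and `sret`, the sketch below prints the attribute text one would expect for a given Rust type. The helper names are hypothetical and are not rustc functions; they only format strings and do not call into LLVM.

```rust
use std::mem::{align_of, size_of};

/// LLVM-style byte-array type with the same size as `T`, e.g. `[16 x i8]`.
fn opaque_type<T>() -> String {
    format!("[{} x i8]", size_of::<T>())
}

/// Text of a `byval` parameter for `T`, with the alignment spelled out explicitly.
fn byval_param<T>() -> String {
    format!("ptr byval({}) align {}", opaque_type::<T>(), align_of::<T>())
}

/// Text of an `sret` return slot for `T`, again with explicit alignment.
fn sret_param<T>() -> String {
    format!("ptr sret({}) align {}", opaque_type::<T>(), align_of::<T>())
}
```

For example, `byval_param::<[u32; 4]>()` yields `ptr byval([16 x i8]) align 4` on common targets; nothing about the original field structure survives, and nothing needs to.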

r? @ghost

erikdesjardins · 2024-02-25T06:47:25Z

compiler/rustc_codegen_ssa/src/mir/place.rs

-            let llval = match self.layout.abi {
-                _ if offset.bytes() == 0 => {
-                    // Unions and newtypes only use an offset of 0.
-                    // Also handles the first field of Scalar, ScalarPair, and Vector layouts.
-                    self.llval
-                }
-                Abi::ScalarPair(..) => {
-                    // FIXME(nikic): Generate this for all ABIs.
-                    bx.inbounds_gep(bx.type_i8(), self.llval, &[bx.const_usize(offset.bytes())])
-                }
-                Abi::Scalar(_) | Abi::Vector { .. } if field.is_zst() => {
-                    // ZST fields (even some that require alignment) are not included in Scalar,
-                    // ScalarPair, and Vector layouts, so manually offset the pointer.
-                    bx.gep(bx.cx().type_i8(), self.llval, &[bx.const_usize(offset.bytes())])
-                }
-                Abi::Scalar(_) => {
-                    // All fields of Scalar layouts must have been handled by this point.
-                    // Vector layouts have additional fields for each element of the vector, so don't panic in that case.
-                    bug!(
-                        "offset of non-ZST field `{:?}` does not match layout `{:#?}`",
-                        field,
-                        self.layout
-                    );
-                }
-                _ => {
-                    let ty = bx.backend_type(self.layout);
-                    bx.struct_gep(ty, self.llval, bx.cx().backend_field_index(self.layout, ix))
-                }
+            let llval = if offset.bytes() == 0 {
+                self.llval
+            } else {
+                bx.inbounds_ptradd(self.llval, bx.const_usize(offset.bytes()))


The old code used a non-inbounds GEP for ZSTs. But it's okay for an inbounds GEP to point one past the end:

The base pointer has an in bounds address of an allocated object, which means that it points into an allocated object, or to its end.

...which handles the ZST being "outside" of e.g. struct Foo(u64, ()), so I think using inbounds is okay. Unless ZSTs can be arbitrarily far outside the bounds of a layout?
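The one-past-the-end case can be checked from plain Rust. This is a sketch under assumptions: `Foo` here is a stand-in for the struct mentioned above, and `repr(C)` is added so the field order is fixed (the default Rust layout does not guarantee where the ZST lands). Rust's `pointer::add` has in-bounds requirements comparable to `inbounds`, and it permits a result one past the end of the allocation.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

/// Stand-in for `struct Foo(u64, ())`; `repr(C)` is an assumption made for determinism.
#[repr(C)]
struct Foo(u64, ());

/// Returns `(offset of the ZST field, size of Foo)`.
fn zst_offset_and_size() -> (usize, usize) {
    let foo = Foo(7, ());
    let base = addr_of!(foo) as usize;
    let zst = addr_of!(foo.1) as usize;
    (zst - base, size_of::<Foo>())
}

/// Computes the ZST field's address with an in-bounds byte offset.
fn zst_ptr(foo: &Foo) -> *const () {
    let (offset, _) = zst_offset_and_size();
    let base = foo as *const Foo as *const u8;
    // SAFETY: `offset` is at most `size_of::<Foo>()`, so the result is inside
    // `foo` or exactly one past its end, which `add` permits.
    unsafe { base.add(offset).cast::<()>() }
}
```

With this layout the ZST offset equals the struct size, so the field pointer sits exactly at the end of the object and no further.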

erikdesjardins · 2024-02-25T06:47:58Z

tests/assembly/stack-protector/stack-protector-heuristics-effect.rs

+//@ ignore-test
+// FIXME: The LLVM stack protector code assumes that alloca types are meaningful,
+// so using [n x i8] types causes it to emit stack protection code for all allocas.
+// It needs to be changed to use some meaningful heuristic.


This heuristic should live in the frontend rather than looking at alloca types; fixing that will require LLVM changes.

Judging from #114903 (comment), and the fact that we're only going to stabilize stack-protector=all right now, this issue seems to be well known, so making this change shouldn't be an issue.

erikdesjardins · 2024-02-25T06:48:23Z

tests/codegen/i128-x86-align.rs

@@ -6,7 +6,6 @@
 // correctly.

 // CHECK: %ScalarPair = type { i32, [3 x i32], i128 }


This struct type still exists because...

erikdesjardins · 2024-02-25T06:48:50Z

tests/codegen/i128-x86-align.rs

    // CHECK:      [[LOAD:%.*]] = load volatile %ScalarPair, ptr %x, align 16
    // CHECK-NEXT: store %ScalarPair [[LOAD]], ptr [[TMP]], align 16


...we allow volatile operations on structs for some reason, which is incredibly cursed.

I assume it gets split into multiple loads/stores in the backend if the struct is too big...
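For context, a volatile operation on a whole struct looks like this on the Rust side. This is a sketch: the `ScalarPair` definition is guessed from the shape of the CHECK line and is not copied from the test file, and whether it lowers to a single `load volatile %ScalarPair` depends on the compiler version.

```rust
use std::ptr;

/// Hypothetical struct shaped like the `%ScalarPair` type in the CHECK line.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ScalarPair {
    a: i32,
    b: i128,
}

/// Volatile read of an entire struct value.
fn load_volatile(x: &ScalarPair) -> ScalarPair {
    // SAFETY: `x` is a valid, aligned reference to an initialized value.
    unsafe { ptr::read_volatile(x) }
}
```

Because the volatile access is expressed on the aggregate as a whole, the backend needs some type for the loaded value, which is why the struct type still appears in the IR.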

erikdesjardins · 2024-02-25T06:49:38Z

tests/codegen/simd/unpadded-simd.rs

 // CHECK: %int16x4x2_t = type { <4 x i16>, <4 x i16> }
 #[no_mangle]
-fn takes_int16x4x2_t(t: int16x4x2_t) -> int16x4x2_t {
+extern "unadjusted" fn takes_int16x4x2_t(t: int16x4x2_t) -> int16x4x2_t {


This test only incidentally used the struct type (via byval, so it would be removed by these changes), but the original motivation (#87254) was for the unadjusted ABI, where we use the struct type directly and pass the vectors by value. Changed it to test that.

erikdesjardins · 2024-02-25T06:50:26Z

tests/codegen/zst-offset.rs

@@ -34,7 +34,7 @@ pub struct U64x4(u64, u64, u64, u64);
 // CHECK-LABEL: @vector_layout
 #[no_mangle]
 pub fn vector_layout(s: &(U64x4, ())) {
-// CHECK: getelementptr i8, {{.+}}, [[USIZE]] 32
+// CHECK: getelementptr inbounds i8, {{.+}}, [[USIZE]] 32


As mentioned above, these ZSTs are one past the end, so inbounds should be fine. Not sure if there are untested cases where they can be further than that.

the8472 · 2024-02-25T17:34:38Z

@bors try @rust-timer queue

bors · 2024-02-25T17:35:47Z

⌛ Trying commit 3ae7ba6 with merge e102148...

Stop using LLVM struct types for alloca, byval, sret, and many GEPs

This is an extension of rust-lang#98615, extending the removal from field offsets to most places that it's feasible right now. (It might make sense to split this PR up, but I want to test perf with everything.)

For `alloca`, `byval`, and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's layout algorithm. Particularly for `alloca`, it is likely that a future LLVM will change to a representation where you only specify the size.

For GEPs, upstream LLVM is in the beginning stages of [migrating to `ptradd`](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699). LLVM 19 will [canonicalize](llvm/llvm-project#68882) all constant-offset GEPs to i8, which is the same thing we do here.

\*: Since we always explicitly specify the alignment. For `byval`, this wasn't the case until rust-lang#112157.

†: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue here.

r? `@ghost`

bors · 2024-02-25T19:03:29Z

☀️ Try build successful - checks-actions
Build commit: e102148 (e10214888084afce9a7b30e134248f0d9f5c1314)

rust-timer · 2024-02-25T20:43:58Z

Finished benchmarking commit (e102148): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.1%, 0.4%] | 14 |
| Regressions ❌ (secondary) | 0.4% | [0.1%, 1.3%] | 36 |
| Improvements ✅ (primary) | -0.9% | [-2.4%, -0.2%] | 20 |
| Improvements ✅ (secondary) | -0.8% | [-1.6%, -0.3%] | 12 |
| All ❌✅ (primary) | -0.4% | [-2.4%, 0.4%] | 34 |

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.1% | [2.1%, 2.1%] | 1 |
| Improvements ✅ (primary) | -2.8% | [-4.4%, -1.8%] | 3 |
| Improvements ✅ (secondary) | -2.1% | [-2.1%, -2.1%] | 1 |
| All ❌✅ (primary) | -2.8% | [-4.4%, -1.8%] | 3 |

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 5.6% | [2.2%, 8.4%] | 7 |
| Improvements ✅ (primary) | -1.2% | [-3.1%, -0.6%] | 25 |
| Improvements ✅ (secondary) | -1.5% | [-2.3%, -1.0%] | 6 |
| All ❌✅ (primary) | -1.2% | [-3.1%, -0.6%] | 25 |

Binary size

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.1%, 1.0%] | 12 |
| Regressions ❌ (secondary) | 0.3% | [0.0%, 1.0%] | 22 |
| Improvements ✅ (primary) | -0.3% | [-0.7%, -0.0%] | 15 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.7%, 1.0%] | 27 |

Bootstrap: 651.457s -> 643.8s (-1.18%)
Artifact size: 311.11 MiB -> 312.90 MiB (0.58%)

Always generate GEP i8 / ptradd for struct offsets

This implements rust-lang#98615, and goes a bit further to remove `struct_gep` entirely.

Upstream LLVM is in the beginning stages of [migrating to `ptradd`](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699). LLVM 19 will [canonicalize](llvm/llvm-project#68882) all constant-offset GEPs to i8, which has roughly the same effect as this change.

Fixes rust-lang#121719.

Split out from rust-lang#121577.

r? `@nikic`

bors · 2024-03-04T01:20:03Z

☔ The latest upstream changes (presumably #121665) made this pull request unmergeable. Please resolve the merge conflicts.

Use GEP inbounds for ZST and DST field offsets

ZST field offsets have been non-`inbounds` since I made [this old layout change](https://github.com/rust-lang/rust/pull/73453/files#diff-160634de1c336f2cf325ff95b312777326f1ab29fec9b9b21d5ee9aae215ecf5). Before that, they would have been `inbounds` due to using `struct_gep`. Using `inbounds` for ZSTs likely doesn't matter for performance, but I'd like to remove the special case.

DST field offsets have been non-`inbounds` since the alignment-aware DST field offset computation was first [implemented](erikdesjardins@a2557d4#diff-04fd352da30ca186fe0bb71cc81a503d1eb8a02ca17a3769e1b95981cd20964aR1188) in 1.6 (back then `GEPi()` would be used for `inbounds`), but I don't think there was any reason for it.

Split out from rust-lang#121577 / rust-lang#121665.

r? `@oli-obk`

cc `@RalfJung` -- is there some weird situation where field offsets can't be `inbounds`?

Note that it's fine for `inbounds` offsets to be one-past-the-end, so it's okay even if there's a ZST as the last field in the layout:

> The base pointer has an in bounds address of an allocated object, which means that it points into an allocated object, or to its end. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)

For rust-lang/unsafe-code-guidelines#93, zero-offset GEP is (now) always `inbounds`:

> Note that getelementptr with all-zero indices is always considered to be inbounds, even if the base pointer does not point to an allocated object. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)

Stop using LLVM struct types for byval/sret

For `byval` and `sret`, the type has no semantic meaning; only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout.

\*: The alignment would also matter if we didn't explicitly specify it. From what I can tell, we always specified the alignment for `sret`; for `byval`, we didn't until rust-lang#112157.

†: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue.

Split out from rust-lang#121577.

r? `@nikic`

erikdesjardins · 2024-03-06T02:07:42Z

All changes have been split out to separate PRs.

Stop using LLVM struct types for alloca

The alloca type has no semantic meaning; only the size matters (the alignment does too, but we specify it explicitly). Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout.

It is likely that a future LLVM version will change to an untyped alloca representation.

Split out from rust-lang#121577.

r? `@ghost`
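A loose Rust-level analogy for an untyped stack slot, offered as a sketch rather than as what rustc emits: the `Slot` and `Pair` types and the function name are hypothetical. The buffer is described only by a byte count and an explicit alignment, yet it can hold a typed value.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// A 16-byte untyped buffer with explicit 8-byte alignment,
/// loosely analogous to `alloca [16 x i8], align 8`.
#[repr(align(8))]
struct Slot([MaybeUninit<u8>; 16]);

/// Hypothetical value type that fits in the slot.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pair {
    x: u64,
    y: u32,
}

/// Stores `value` into a raw byte slot and reads it back.
fn roundtrip_through_bytes(value: Pair) -> Pair {
    // The slot only has to be big enough and aligned enough for `Pair`.
    assert!(size_of::<Pair>() <= size_of::<Slot>());
    assert!(align_of::<Pair>() <= align_of::<Slot>());

    let mut slot = Slot([MaybeUninit::uninit(); 16]);
    let p = slot.0.as_mut_ptr().cast::<Pair>();
    // SAFETY: `p` points to 16 writable bytes aligned to 8, which the
    // assertions above show is sufficient for a `Pair`.
    unsafe {
        p.write(value);
        p.read()
    }
}
```

The typed view exists only at the point of the write and read; the storage itself carries no field structure.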

Rollup merge of rust-lang#122051 - erikdesjardins:cleanup, r=nikic

cleanup: remove zero-offset GEP

This GEP would've been used to change the pointer type in the past, but after opaque pointers it's a no-op. I missed removing this in rust-lang#105545.

Split out from rust-lang#121577.

Stop using LLVM struct types for array/pointer offset GEPs

...and just use a byte array with the same size as the element type instead. This avoids depending on LLVM's struct layout to determine the size of the array/pointer element.

Spiritually split out from rust-lang#121577.

r? `@nikic`
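The arithmetic behind that change can be sketched in ordinary Rust. The function names are hypothetical. Indexing with a `[N x i8]` element type advances the pointer by `index * N` bytes, which is what the helper below spells out by hand, with `N` taken from `size_of::<T>()`.

```rust
use std::mem::size_of;

/// Returns a pointer to element `index`, computed as a byte offset of
/// `index * size_of::<T>()` from `base`.
///
/// # Safety
/// `base` must point to at least `index + 1` valid elements.
unsafe fn element_ptr<T>(base: *const T, index: usize) -> *const T {
    let bytes = index * size_of::<T>();
    unsafe { base.cast::<u8>().add(bytes).cast::<T>() }
}

/// Bounds-checked element read built on the byte-offset helper.
fn nth<T: Copy>(items: &[T], index: usize) -> Option<T> {
    if index >= items.len() {
        return None;
    }
    // SAFETY: `index` was checked against the slice length above.
    Some(unsafe { element_ptr(items.as_ptr(), index).read() })
}
```

Only the element size enters the computation, so the element's internal structure never has to be described to LLVM.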

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 25, 2024

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 25, 2024

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 25, 2024

erikdesjardins force-pushed the struct branch from 3ae7ba6 to d9165fc Compare February 27, 2024 03:27

erikdesjardins added 7 commits February 26, 2024 22:28

always use gep inbounds i8 (ptradd) for field offsets

123015e

remove struct_gep, use manual layout calculations for va_arg

beed25b

introduce and use ptradd/inbounds_ptradd instead of gep

89c6eb5

use [N x i8] for alloca types

7766d25

FIXME: ignore stack protector test

72e8eff

use [N x i8] for byval/sret types

dbbb7de

remove all-zero GEP

3c1ff4f

This always produces zero offset, regardless of what the struct layout is. Originally, this may have been necessary in order to change the pointer type, but with opaque pointers, it is no longer necessary.

erikdesjardins force-pushed the struct branch from d9165fc to 3c1ff4f Compare February 27, 2024 03:32

erikdesjardins mentioned this pull request Feb 27, 2024

Always generate GEP i8 / ptradd for struct offsets #121665

Merged

This was referenced Mar 5, 2024

Use GEP inbounds for ZST and DST field offsets #122048

Merged

Stop using LLVM struct types for byval/sret #122050

Merged

erikdesjardins mentioned this pull request Mar 6, 2024

cleanup: remove zero-offset GEP #122051

Merged

erikdesjardins mentioned this pull request Mar 6, 2024

Stop using LLVM struct types for alloca #122053

Merged

erikdesjardins closed this Mar 6, 2024

erikdesjardins deleted the struct branch March 6, 2024 02:07

This was referenced Mar 11, 2024

Stop using LLVM struct types for array/pointer offset GEPs #122325

Open

Use ptradd for vtable indexing #122320

Merged
